SELF-TUITION AS A METHOD OF ADMINISTERING THE
SEMANTIC TEST OF INTELLIGENCE, NQT-4


PURPOSE  OF  THE  STUDY 

The Nonlanguage Qualification Test, NQT-2 and -3, an experimental
test developed for the Army by Naloa (1959), was designed to replace
NQT-1. Thus, its main function in the screening process was to be
assessment of the military trainability of inductees who have failed the
AFQT and demonstrated a low level of literacy on the Verbal-Arithmetic
subtest (VA). However, tryouts of NQT-2 and -3 revealed several
difficulties in administration. The instructions were given in pantomime
by the examiner, and the test employed a format which involved frequent
repetitions of a three-phase cycle:


1.  Examiner  explained  and  demonstrated  the  task. 


2. Examinee completed non-scored practice items.


3.  Examinee  marked  test  items  proper. 


The pantomime instructions required that examiners be given special
training. Since examinees found these pantomime instructions difficult
to understand, one proctor had to be provided for every five examinees.
The new format and length of the test resulted in a test administration
lasting 2 1/4 to 2 1/2 hours.

To  overcome  these  difficulties,  NQT-2  and  -3  were  modified  to 
permit  self-tuition,  by  which  examinees  obtain  immediate  knowledge  of 
the  correctness  of  their  responses  without  assistance  from  the  proctor 
(Mundy, Tye, and Schenkel, 1956). Self-tuition was believed to aid the
examinee in discovering the nature of the task without undue assistance
from the proctor. After exploring various techniques for providing
self-tuition (mechanical, electrical, and chemical), a chemical reaction
method was adopted. Pages of the test booklet, which also serve as the
answer sheet, were chemically treated so that the examinee's mark, made
with a special pen, was red for wrong alternatives and blue for correct
alternatives. Only 20 pages from the 42 pages of NQT-2 were processed
in this manner. The new test was designated NQT-4X.

Quite obviously the NQT-4X could not be made operational until its
correlation with NQT-1 (which it was designed to replace) was determined
and alternative methods of scoring had been evaluated. A secondary
purpose of the study was to determine if the self-tuition feature of
NQT-4X  and  the  novel  form  of  the  items  were  valuable  enough  to  warrant 
considering  the  incorporation  of  these  features  in  other  screening 
tests. 


The present report describes analysis of data obtained in an
exploratory administration of NQT-4X to determine the feasibility of
the technique employed. If the results were favorable, a full-scale
validation and standardization would be undertaken.


CONTENT  AND  FORMAT 


Fundamentally, the NQT-4X is a test of the ability to learn the
symbols of an artificial language (Table 1). In the first twelve
pages of the test, nine symbols are defined pictorially. After each
definition is a set of items which require the examinee to select from
five pictures the one picture which correctly defines the symbol. The
symbols defined in these twelve pages represent three nouns: boy, woman,
dog; three intransitive verbs: running, lying, walking; and three
transitive verbs: striking, pulling, chasing. The test items which
follow (pages 13 and 14) test for the definition of all nine symbols.

More complex items follow (pages 15 and 16) in which two symbols, a
noun and verb, are put together so as to make a sentence; the examinee
must select the correct picture to represent the meaning of the
sentence. The items progress to three-symbol sentences (pages 17 and
18) and four-symbol sentences (pages 19 and 20). Since definitions
of symbols needed to answer a set of items always reappear with the
set of items, the test does not require the examinee to memorize the
symbol definitions; of course, some degree of memorization will increase
the speed with which the test items are solved.

Since NQT-4X items are of the self-tuitional variety, verbal
directions  (not  previously  given)  and  one  page  of  six  demonstration 
items  (the  first  page  of  the  test  booklet)  were  provided  at  the 
beginning  of  the  test.  No  other  guidance  was  offered  throughout  the 
test.  One  and  one-half  hours  was  allowed  for  taking  the  test. 

In  general,  the  nature  of  the  items  and  the  format  represent  an 
effort  to  construct  a  test  of  ability  which  is  to  a  large  extent  free 
from  the  differential  influence  of  past  experience.  No  examinee  could 
have had the opportunity to learn the meaning of the symbols prior to
taking  the  test.  Each  examinee  was  given  an  equal  number  of  practice 
trials  in  which  to  learn  each  symbol  tested  for.  Following  Rulon, 
this  format  is  referred  to  as  the  semantic  test  format. 


METHOD 


The basic design of this study involved determining the correlation
coefficients between varying combinations of NQT-4X pages, scored by
various formulae, and the currently operational NQT-1.
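The correlational design described above can be sketched in Python. This is a minimal illustration, not the computing procedure used in the study: the function name `pearson_r` and all score values are hypothetical and were chosen only to show that a rights score tends to correlate positively, and an error score negatively, with a reference test.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five examinees (illustrative only, not study data).
nqt4x_rights = [10, 14, 9, 20, 17]   # items right on the first trial
nqt4x_errors = [30, 22, 33, 12, 18]  # total wrong marks
nqt1_scores = [31, 35, 28, 44, 40]   # reference test

r_rights = pearson_r(nqt4x_rights, nqt1_scores)
r_errors = pearson_r(nqt4x_errors, nqt1_scores)
print(f"rights vs NQT-1: {r_rights:+.2f}")
print(f"errors vs NQT-1: {r_errors:+.2f}")
```

Because error scores run in the opposite direction from rights scores, their coefficients with a reference test are expected to be negative.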


Table 1

CONTENT OF NQT-4X

Page    New symbol   Definitions               Symbols                   No. symbols in       No.
        defined      repeated                  tested for                lead for items       items

1       dog          none                      dog                       1                    6
2       woman        none                      woman                     1                    6
3       none         dog, woman                dog, woman                1                    7
4       boy          none                      boy                       1                    6
5       none         dog, woman, boy           dog, woman, boy           1                    7
6       running      none                      running                   1                    6
7       none         dog, woman, boy, running  dog, woman, boy, running  1                    6
8       lying        none                      lying                     1                    6
9       walking      none                      walking                   1                    6
10      striking     none                      striking                  1                    6
11      pulling      none                      pulling                   1                    6
12      chasing      none                      chasing                   1                    6
13,14   none         all previous              all previous              1                    14
15,16   none         all previous              all previous              2-symbol sentences   14
17,18   none         all previous except       all previous except       3-symbol sentences   13
                     running, lying, walking   running, lying, walking
19,20   none         all previous              all previous              4-symbol sentences   13


SAMPLE 


The appropriate population for evaluating NQT-4X would have been
men who failed the AFQT and the VA subtest, that is, the men to whom
NQT-1 is normally given. However, in such a population the range of
AFQT scores is severely restricted. Since a somewhat larger range
of scores on AFQT was desirable, a sample of 224 cases was selected
in which AFQT percentile scores ranged from 1 to 19; the sample was
not limited to VA subtest failures. During July and February 1959,
the NQT-4X was administered to Selective Service registrants at the
Atlanta, Newark, and New York AFES. Usable cases were divided into
three subsamples:

Subsample A. AFQT percentile scores 10-19 (N = 35)

Subsample B. AFQT percentile scores 5-9 (N = 86)

Subsample C. AFQT percentile scores 1-4 (N = 105)


VARIABLES 

As  a  basis  for  analysis  to  determine  an  appropriate  scoring 
formula for NQT-4X items and to select NQT-4X pages which should be
scored,  several  alternative  scoring  procedures  were  established  and 
a  representative  set  of  pages  was  selected. 

Scoring  procedures.  Traditional  scoring  formulas  developed  to 
correct  for  chance  success  were  not  applicable  to  NQT-4X  items  because 
an  examinee  was  allowed  to  continue  marking  alternatives  until  the 
correct  answer  was  achieved.  The  three  scoring  methods  explored  in  the 
present  analysis  are  described  below: 

1.  Rights  only.  Credit  was  given  only  for  those  items  which  the 
examinee  got  right  on  the  first  trial.  Items  marked  correctly  after 
the  first  trial  were  not  scored. 

2. Total errors. Another procedure entailed counting the number
of errors made per item in achieving a correct answer, and utilizing the
total number of errors made as the examinee's score.

3. Corrected errors. In neither of the above procedures was
allowance made for omitted items. To make such allowance, an error
weight was assigned to omitted items with the assumption that omitted
items, had they been attempted, would have been answered with chance
success. On a chance basis, one-fifth of those attempting an item
would make no errors, one-fifth would make one error, one-fifth would
make two errors, one-fifth would make three errors, and one-fifth would
make four errors. The median number of errors for a group answering the
item on a chance basis would thus be 2, which was taken as the expected
number of errors for an omitted item. The formula for correcting errors
to allow for omits is, then,

$E_c = E + 2 \times O$, where

$E_c$ = corrected errors
$E$ = total errors
$O$ = number of omitted items
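The three scoring procedures can be sketched as follows. The data layout is an assumption made for illustration: each item is recorded as the number of wrong alternatives marked before the correct one, with `None` standing for an omitted item. The function names and the sample record are hypothetical.

```python
from statistics import median

N_ALTERNATIVES = 5  # each NQT-4X item offers five pictures

# Under chance responding the number of errors on an item is equally
# likely to be 0, 1, 2, 3, or 4; the median of that distribution is 2.
CHANCE_ERRORS = list(range(N_ALTERNATIVES))
OMIT_WEIGHT = median(CHANCE_ERRORS)

def rights_only(responses):
    """Count items answered correctly on the first trial (zero errors)."""
    return sum(1 for errors in responses if errors == 0)

def total_errors(responses):
    """Sum the errors made on attempted items; omitted items add nothing."""
    return sum(errors for errors in responses if errors is not None)

def corrected_errors(responses):
    """Total errors plus the chance weight of 2 for every omitted item."""
    omits = sum(1 for errors in responses if errors is None)
    return total_errors(responses) + OMIT_WEIGHT * omits

# Hypothetical record for one examinee on a seven-item page.
record = [0, 2, 0, None, 1, 4, None]

print(rights_only(record))       # 2
print(total_errors(record))      # 7
print(corrected_errors(record))  # 11
```

For this record, two items are right on the first trial, seven errors are made on the attempted items, and the two omits add 2 errors each, giving a corrected score of 11.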


Pages selected for scoring. On the basis of the average p-value
computed for each page in the total sample, the following pages of
NQT-4X were selected as being the most meaningful for statistical
analysis:

Pages                                             Average p value

5, 6, and 7 combined (symbol definition)               .73
13 and 14 combined (symbol definition)                 .56
15 and 16 combined (two-symbol sentences)              .32
17 and 18 combined (three-symbol sentences)            .25
19 and 20 combined (four-symbol sentences)             .22
5, 6, 7 and 13-20 combined
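The report does not define the p-value. The sketch below assumes the conventional item-difficulty index, that is, the proportion of examinees answering an item correctly on the first trial, with omitted items counted as not correct. The function names and the small response matrix are hypothetical.

```python
def item_p_values(first_try_correct):
    """Proportion of examinees right on the first trial, one value per item.

    first_try_correct[i][j] is True if examinee i answered item j
    correctly with the first mark, and False otherwise.
    """
    n_examinees = len(first_try_correct)
    n_items = len(first_try_correct[0])
    return [
        sum(row[j] for row in first_try_correct) / n_examinees
        for j in range(n_items)
    ]

def average_p(first_try_correct):
    """Mean of the item p-values for a page or page combination."""
    p_values = item_p_values(first_try_correct)
    return sum(p_values) / len(p_values)

# Hypothetical page of four items taken by four examinees.
page = [
    [True, True, False, True],
    [True, False, False, True],
    [False, True, False, True],
    [True, True, True, True],
]

print(item_p_values(page))  # [0.75, 0.75, 0.25, 1.0]
print(average_p(page))      # 0.6875
```

Under this reading, a lower average p-value marks a harder page, which matches the decline from symbol definition to four-symbol sentences in the list above.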


Since three scoring formulas were employed on each page, each of
the page combinations yielded three scores: rights only, total errors,
and corrected errors. Table 2 summarizes the experimental variables
(page combinations and scoring formula) analyzed in this study.

Reference  Variables.  NQT-1  and  VA  subtest  scores  were  obtained. 


RESULTS 

Table 2 gives the product moment correlation coefficients of the
experimental variables with the reference variables. None of the
variously scored page combinations yielded satisfactory coefficients
with NQT-1, even though many of the coefficients were significantly
different from zero. Accordingly, no further inquiry into the matter
of selecting scoring formulae or page combinations to be employed with
NQT-4X was considered appropriate. Further refinements of the NQT-4X
as a replacement for NQT-1 would appear to be futile.
However, since NQT-4X represents a unique effort in test construction,
the data were examined for any possible light on the potential usefulness
of the self-tuition principle and the semantic format of the test.

Because  a  control  group  was  not  included  in  the  design,  the  present 
study did not permit a rigorous evaluation of the self-tuition principle.
Previously  published  results  provide  the  only  baseline  available  for 
evaluating  immediate  knowledge  of  results  as  a  test  technique. 

Previous research with the self-tuition principle yielded evidence
implying that immediate knowledge of results given early in a series
of items of the same type would lead to an increase in number of rights
for the later items of the same series. Examination of the number of
correct responses to a series of items of the same type failed to confirm
this expectation. There were six test items for each of the eight
symbols defined in the test (excluding the first page, which was not
scored). No rapid, abrupt, or even consistent improvement was observed
from  item  to  item  (Table  3). 
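One way to look for item-to-item improvement within a series is to tabulate the proportion of first-trial successes at each serial position. The sketch below is an assumed procedure, not the tabulation behind Table 3: the function names and the data are hypothetical, and the data were constructed to show a flat profile of the kind the text describes.

```python
def proportion_by_position(series):
    """Proportion right on the first trial at each serial position.

    series holds one row per examinee-by-symbol series; each row lists
    first-trial outcomes (True/False) in order of presentation.
    """
    n_positions = len(series[0])
    return [
        sum(row[pos] for row in series) / len(series)
        for pos in range(n_positions)
    ]

def early_late_gain(series):
    """Mean proportion right in the later half minus the earlier half."""
    p = proportion_by_position(series)
    half = len(p) // 2
    early = sum(p[:half]) / half
    late = sum(p[half:]) / (len(p) - half)
    return late - early

# Hypothetical six-item series for four examinees (not study data).
series = [
    [True, False, True, False, True, False],
    [False, True, False, True, False, True],
    [True, True, False, False, True, True],
    [False, False, True, True, False, False],
]

print(proportion_by_position(series))  # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
print(early_late_gain(series))         # 0.0
```

If immediate knowledge of results were aiding learning, the later positions would show higher proportions and the gain would be clearly positive.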

Why  did  immediate  knowledge  of  results  fail  to  confirm  expectations 
derived  from  previous  research?  In  the  first  place,  it  may  be  that 
the  AFQT  failures  used  in  this  sample  did  not  profit  from  the  type 
of  knowledge  of  results  given  in  the  test.  For  a  low  level  examinee, 
knowing  that  he  is  right  or  wrong  may  not  be  as  important  as  knowing 
why he was right or wrong. In the second place, immediate knowledge
of results may have failed to operate as expected because nowhere in the
test was the examinee given a sufficient number of trials with any
particular symbol. Immediate knowledge of results may not produce a
noticeable effect when the number of trials is small. In the third
place, the color difference between right and wrong responses was not
sharply defined, although this difference became more sharply
differentiated with the passage of time. The technical inadequacy of the
color coding process may have contributed to failure of immediate
knowledge of results. Finally, it is not inconceivable that the
effectiveness of immediate knowledge of results is dependent upon the
type of learning task to which it is applied. The items of the NQT-4X may
constitute a type of learning task which is not facilitated by the
application of immediate knowledge of results.


IMPLICATIONS FOR FUTURE RESEARCH

Future empirical evaluation of the principle of immediate knowledge
of results should take cognizance of probable reasons for the failure of
the principle in this study. Notably, if color coding is to be employed
to convey knowledge of results, colors of the right and wrong marks on
the answer sheet must be discriminably different and this difference
must be immediately apparent to the examinee. NQT-4X did not meet this
criterion, and should not be used in further evaluative studies. A more
efficient means of providing immediate knowledge of results during
test taking (a testing machine with knowledge of results programmed,
for example) should be used instead.


Table 2

CORRELATION COEFFICIENTS OF EXPERIMENTAL VARIABLES
WITH NQT-1 AND VA SUBTEST

Content                   Pages                     Scoring formula     r with NQT-1   r with VA
                                                                        (N = 190)      (N = 121)

Symbol definition         5, 6, 7 combined          Rights Only          .40*           .14
                                                    Errors Only         -.35*          -.17
                                                    Corrected Errors    -.57*          -.17

Symbol definition         13 and 14 combined        Rights Only          .44*           .17
(all symbols previously                             Errors Only         -.58*          -.20*
presented)                                          Corrected Errors    -.41*          -.20*

2-symbol sentences        15 and 16 combined        Rights Only          .21*           .13
                                                    Errors Only         [illegible]    -.16
                                                    Corrected Errors    -.51*          -.17

3-symbol sentences        17 and 18 combined        Rights Only          .19*           .26
                                                    Errors Only         -.16*          [illegible]
                                                    Corrected Errors    -.20*          -.18*

4-symbol sentences        19 and 20 combined        Rights Only          .06            .27*
                                                    Errors Only         -.05           -.27*
                                                    Corrected Errors    -.11           -.50*

All items                 5, 6, 7, and 13 thru 20   Rights Only          .58*           .25*
                          combined                  Errors Only         -.54*          -.28*
                                                    Corrected Errors    -.58*          -.28*

*Significant at p < .05
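The report does not say which test produced the asterisks. As a hedged illustration, the sketch below checks a coefficient against zero with the Fisher z transformation and a normal approximation, a stand-in chosen here and not necessarily the authors' procedure. The function names are hypothetical; the coefficients and sample sizes in the usage lines are taken from the table.

```python
from math import atanh, sqrt

Z_CRIT_05 = 1.96  # two-tailed critical value of the standard normal at p = .05

def fisher_z_stat(r, n):
    """Approximate normal deviate for testing r against zero with n cases."""
    return atanh(r) * sqrt(n - 3)

def significant_at_05(r, n):
    """True if r differs from zero at the .05 level (two-tailed)."""
    return abs(fisher_z_stat(r, n)) > Z_CRIT_05

# Coefficients with the VA subtest (N = 121).
print(significant_at_05(0.17, 121))   # False
print(significant_at_05(-0.20, 121))  # True

# Coefficients with NQT-1 (N = 190).
print(significant_at_05(-0.16, 190))  # True
print(significant_at_05(-0.11, 190))  # False
```

For these four entries the approximation agrees with the asterisks in the table: with 121 cases a coefficient of about .18 is needed, while with 190 cases about .14 suffices.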
In future research dealing with immediate knowledge of results,
the nature of the test items (item difficulty, content, etc.) and the
ability level of examinees should be systematically varied in order to
provide results from which adequate generalizations can be drawn.

The semantic format of the NQT-4X must also be subjected to further
experimentation before judgment of its usefulness can be rendered. One
important  variable  in  this  experimentation  would  be  the  number  of 
practice  trials  needed  by  individuals  of  different  levels  of  ability  to 
reach  a  successful  solution  of  the  test  items.  Individuals  of  low  ability 
can  be  expected  to  require  more  practice  trials  than  individuals  of  high 
ability.  The  test  in  its  present  form  may  not  include  enough  practice 
trials  so  that  low  level  individuals  can  perform  satisfactorily  on  the 
test  problems. 

In summary, the NQT-4X should be abandoned as a potential screening
device and as a research instrument for evaluating immediate knowledge
of results. Future research evaluating immediate knowledge of results
is contingent upon finding a technically more adequate method of providing
knowledge of results. The research design should take into consideration
the ability level of the experimental group and the nature of the test
items, and should ensure an adequate number of items of a certain type.